Abstract: This solution set compares different denoising methods (boxcar averaging, the
wavelet transform, and the Fourier transform) for removing RF/AC interference from IgG
protein mass spectra acquired with a linear ion trap mass spectrometer using the
frequency-scan resonance ejection method. Simulations support the comparison of the
denoised signals and the filtered noise components produced by each method.


1.Introduction 


The filtering of data to remove unwanted signal components is of interest to a wide
variety of engineering and science disciplines. Denoising is also an important data
processing step for improving the signal-to-noise ratio (S/N) of a spectrum. Denoising
approaches adopted in spectrometry smooth the spectra or average the noise in the
mass-to-charge ratio (m/z) domain, and these techniques are found throughout the signal
processing literature. Prior research has extensively explored various denoising methods
and found that wavelet-based methods result in minimal distortion of the original data.
With proper selection of the wavelet function and decomposition level, the noise embedded
in mass spectrometry data can be substantially removed. Building on these findings, I
apply the Fourier transform, the wavelet transform, and boxcar averaging here to see
whether similar results are obtained.


The remainder of this paper is organized as follows. Section 2 describes boxcar
averaging, its formulation, and how it works. Section 3 describes the discrete wavelet
transform (DWT), the discrete Fourier transform (DFT), and the basics of transform domain
filtering. Section 4 introduces the signal-to-noise ratio (SNR). Section 5 discusses the
denoising algorithm in detail, Section 6 gives the simulation analysis and results, and
Section 7 provides the conclusions.


2.Box Car Average 


Boxcar averaging is one of the earliest methods for data smoothing. It is performed by
dividing the data into discrete, equally spaced windows, or boxcars. The data points in
each boxcar are replaced by the centroid, the average of all the values in the window.
In hardware, the core circuitry is similar to a regular RC low-pass filter that can be
gated by a switch S.


Boxcar averaging can be done for both the x values and y values. The equation for per- 
forming boxcar averaging in y-dimension is shown below: 


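$$\mathrm{filtered}_i = \frac{1}{2n+1} \sum_{j=-n}^{n} y_{i+j} \qquad (1)$$


filtered_i = the y value of the new centroid point.
2n+1 = width of the boxcar window in points; j indexes the points in the window (j = -n, ..., n).
y_{i+j} = a y value from the original data that is within the boxcar.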


Boxcar averaging is primarily effective at reducing random noise: it smooths out random
fluctuations by averaging each point with its neighboring data points. In the frequency
domain it acts as a low-pass filter, so the averaging also attenuates high-frequency
noise components. It achieves a high signal-to-noise ratio (SNR) in a minimal amount of
measurement time when working with low-duty-cycle signals. Boxcar averaging is also used
to improve computing speed.
Figure 01: Signal, Boxcar average, DWT, DFT
Figure 02: Boxcar average, DWT, DFT noise


3.Transform Domain Filtering 


A. The Discrete Wavelet Transform 


Wavelet threshold denoising relies on the different mathematical characteristics of the
wavelet coefficients of the target signal and of the noise. Typically, the wavelet
coefficients of the desired signal have larger amplitudes and concentrated energy, while
those of the noise have smaller amplitudes and a more uniform energy distribution. A
threshold function can therefore be set to distinguish the desired signal from the noise.
The choice of threshold significantly affects the denoising effectiveness. Traditional
threshold functions include hard thresholding and soft thresholding, each with its own
advantages and disadvantages.


The discrete wavelet transform (DWT) of a sequence x(n) is given by


$$W(J, b) = \frac{1}{\sqrt{2^J}} \sum_{n} x(n)\, \psi\!\left(\frac{n-b}{2^J}\right) \qquad (2)$$
where ψ represents a wavelet function, which is dilated and contracted by the integer
scale factor J and delayed in time by the parameter b. For an N-point sequence, the scale
factor J assumes the values J = 0, 1, ..., log2(N), producing a multiresolution
decomposition of the input into octave bands. The delay b is related to the scale by
b = K·2^J for K an integer. Thus, the DWT output is decimated by a factor of two at each
successive octave J. The DWT requires an input sequence length that is a power of two,
i.e., N = 2^m for an integer m, and produces an equal number of wavelet coefficients.
The DWT is a linear but time-variant transformation. The inner product of Eq. 2 produces
a DWT output W, which is a set of N coefficients that represent the data in the wavelet
domain. This set contains the information necessary to reconstruct the original signal
from the corresponding wavelet function via the inverse wavelet transform (IDWT). The
magnitude of each coefficient represents the correspondence between the input signal and
the decomposing wavelet function at a particular delay b and scale J. For simplicity and
ease of display, the discrete wavelet coefficients can be represented as a vector W by
collecting the coefficients over all scales.


$$W = [w_0, w_1, w_2, \ldots, w_{N-1}] \qquad (3)$$
This formulation allows the DWT output to be plotted as shown in Figure 01.


B. The Discrete Fourier Transform 


The Fourier transform can be subdivided into different types. The most basic subdivision
is based on the kind of data the transform operates on: continuous functions or discrete
functions. Here I deal with the discrete Fourier transform (DFT), and more precisely with
the fast Fourier transform (FFT). The FFT is an algorithm for computing the DFT, whereas
the DFT is the transform itself. The FFT is used here to move the input signal into the
frequency domain, a thresholding step then separates the signal from the noise
components, and the inverse FFT (IFFT) transforms the filtered signal back to the time
domain.


The well-known discrete Fourier transform (DFT) of a sequence x(n) is given by 


$$X(k) = \sum_{n=0}^{N-1} x(n) \exp\!\left(-\frac{j 2\pi n k}{N}\right) \qquad (4)$$


where j is the imaginary unit and the sequence [X(0), X(1), ..., X(N-1)] is the transform
sequence. The frequency index k = 0, ..., N-1 is related to the analog frequency f and the
sampling frequency f_s by k = N f / f_s.
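
To make Eq. 4 and the index relation concrete, the following minimal sketch evaluates the DFT sum directly and compares it with NumPy's FFT. The tone frequency, sampling rate, and length are hypothetical values chosen only so that the tone falls exactly on a frequency bin; they are not taken from the measured spectra.

```python
import numpy as np

def dft_direct(x):
    """Evaluate Eq. 4 literally: X(k) = sum_n x(n) exp(-j 2 pi n k / N)."""
    x = np.asarray(x, dtype=complex)
    N = x.size
    n = np.arange(N)
    k = n.reshape(-1, 1)                      # one row per frequency index k
    return np.exp(-2j * np.pi * k * n / N) @ x

# Hypothetical test signal: a 50 Hz tone sampled at fs = 400 Hz for N = 64 points.
fs, N, f = 400.0, 64, 50.0
t = np.arange(N) / fs
x = np.sin(2 * np.pi * f * t)

X_direct = dft_direct(x)
X_fft = np.fft.fft(x)                         # same transform, computed by the FFT

k_peak = int(np.argmax(np.abs(X_fft[: N // 2])))   # strongest bin in the first half
k_expected = int(round(N * f / fs))                # index relation k = N f / f_s
```

The direct sum costs on the order of N^2 operations, which is why the FFT is used in practice; both give the same coefficients up to rounding.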


C. Noise Filtering 


Fig. 2 shows the noise component. Observation of Fig. 1 and Fig. 2 shows that the noise
is distributed as small coefficients throughout both transform domains. The separation of
signal and noise into large and small coefficients permits the application of a noise
threshold to remove the smaller coefficients of the decomposition, those presumably
associated with the noise. The general method for calculating a threshold is based on the
statistical properties of the transform coefficients. The noise standard deviation can be
estimated in a number of ways; an often-used approach is to compute the median absolute
deviation of the coefficients. Once the noise level of the transformed data is
established, a threshold value can be set. A popular choice for the threshold level is
the universal threshold, which is defined as:


$$T = \sigma \sqrt{2 \log(N)} \qquad (5)$$
and thus is a multiple of σ (the noise standard deviation).


After threshold selection, various methods of application are discussed in the
literature. Two of the most popular are hard thresholding and soft thresholding. Hard
thresholding sets all coefficients whose magnitude is below the threshold to zero and
retains the remaining coefficients unchanged. Soft thresholding also sets those
coefficients to zero and, in addition, reduces the magnitude of the remaining
coefficients by the threshold value. The transform noise removal technique can be
described by three steps, as displayed in the block diagram of Fig. 3: (1) transform the
noisy signal x(n) into the transform domain via the DFT or DWT, (2) threshold the
transform coefficients (to remove noise), and (3) perform the inverse transform on the
modified coefficients to produce the filtered signal y(n).


x(n) --> [ DWT or DFT ] --> [ Threshold ] --> [ IDWT or IDFT ] --> y(n)


Figure 3: Block diagram of the three steps transform domain filtering. 
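
The three steps of the block diagram can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a hand-written orthonormal Haar DWT (not the wavelet used later in this work), a hard threshold, and a hypothetical step-shaped test signal whose length is divisible by 2^levels. The function names are made up for this example.

```python
import numpy as np

def haar_dwt(x, levels):
    """Multi-level orthonormal Haar DWT: [approx, detail_coarsest, ..., detail_finest]."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))   # high-pass (detail) half
        approx = (even + odd) / np.sqrt(2)          # low-pass (approximation) half
    return [approx] + details[::-1]

def haar_idwt(coeffs):
    """Invert haar_dwt by undoing one level at a time, coarsest first."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + detail) / np.sqrt(2)
        out[1::2] = (approx - detail) / np.sqrt(2)
        approx = out
    return approx

def transform_domain_filter(x, threshold, levels=3):
    coeffs = haar_dwt(x, levels)                                # step 1: transform
    kept = [coeffs[0]] + [np.where(np.abs(d) > threshold, d, 0.0)
                          for d in coeffs[1:]]                  # step 2: hard threshold
    return haar_idwt(kept)                                      # step 3: inverse transform

# Hypothetical test data: a step signal plus a small alternating disturbance.
clean = np.concatenate([np.zeros(32), np.full(32, 10.0)])
noisy = clean + 0.1 * (-1.0) ** np.arange(64)
denoised = transform_domain_filter(noisy, threshold=0.5)
```

In this toy case the disturbance lands entirely in the finest detail coefficients, so thresholding removes it while the step, which lives in the approximation coefficients, is untouched.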


4.Signal to Noise ratio 


The signal-to-noise ratio (SNR) is the ratio between the power of the desired signal and
the power of the undesired background noise. It is a measurement parameter used across
science and engineering to compare the level of a desired signal to the level of
background noise.


The SNR is given by the relation:


$$\mathrm{SNR\,(dB)} = 10 \log_{10}\!\left(\frac{\mu_{\mathrm{signal}}^2}{\sigma_{\mathrm{noise}}^2}\right) \qquad (6)$$
where μ_signal is the mean amplitude of the signal and
σ_noise is the standard deviation of the noise amplitude.
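
Eq. 6 translates directly into a short function. The sketch below assumes that the signal and noise are available as separate arrays (for example, a denoised trace and the component that was removed from it); that split, the function name `snr_db`, and the test values are assumptions made for illustration.

```python
import numpy as np

def snr_db(signal, noise):
    """SNR in dB per Eq. 6: 10 * log10(mean(signal)^2 / std(noise)^2)."""
    mu_signal = np.mean(signal)
    sigma_noise = np.std(noise)
    return 10.0 * np.log10(mu_signal**2 / sigma_noise**2)

# Hypothetical values: constant signal of amplitude 10, noise alternating between +1 and -1.
signal = np.full(100, 10.0)
noise = (-1.0) ** np.arange(100)
example_snr = snr_db(signal, noise)   # mean 10, std 1, so 10 * log10(100) = 20 dB
```

Because the numerator is the squared mean of the signal, this definition depends on the baseline level of the spectrum as well as on the peaks.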


As a rough guide, different SNR values can be read as follows:


5 dB to 10 dB: below the minimum usable level.
10 dB to 15 dB: barely acceptable.
15 dB to 25 dB: not great, but workable.
25 dB to 40 dB: good.
41 dB or higher: excellent.


5.Denoising Algorithm 


The schematic diagram of the denoising algorithm for signals mixed with RF/AC noise is
shown in Figure 03.

Figure 03: Algorithm Denoising Flowchart


The algorithm is based on the methods discussed in the previous sections of this solution
set, with the objective of removing RF/AC interference from the provided protein mass
spectra.


The input noisy signal is a 2D array, so it first needs to be separated into m/z and
intensity values before the denoising methods can be run.
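
As a small illustration of that split, the sketch below assumes the array has one row per data point with m/z in the first column and intensity in the second. The column order and the sample values are assumptions; the layout of the actual data file is not stated in the text.

```python
import numpy as np

# Hypothetical stand-in for the loaded spectrum: column 0 = m/z, column 1 = intensity.
spectrum = np.array([
    [1000.0, 0.12],
    [1000.5, 0.15],
    [1001.0, 0.11],
])

mz = spectrum[:, 0]          # x axis of the mass spectrum
intensity = spectrum[:, 1]   # y axis, the trace that is passed to the denoisers
```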


For the boxcar average, a window size of 12 was found by visual inspection to be a good
fit: the denoised signal shows little distortion, the signal features are preserved, and
a good amount of the noise is removed by the smoothing. The mean intensity is calculated
with np.convolve, which convolves two arrays by sliding one over the other, multiplying
them element-wise, and summing the result. The resulting convolution is a smoothed
version of the original data in which each value is the average of the neighboring values
within the window. It therefore works as a moving average filter here.
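
A minimal sketch of this moving average with np.convolve is shown below. The window of 12 comes from the text; the `mode="same"` choice (output the same length as the input, with zero padding at the edges) and the function name are assumptions, since the exact call is not given.

```python
import numpy as np

def boxcar_average(intensity, window=12):
    """Moving average: convolve with a flat kernel whose weights sum to 1."""
    kernel = np.ones(window) / window
    # mode="same" keeps the output length equal to the input length (assumed here);
    # samples near the two ends are biased low because of the implicit zero padding.
    return np.convolve(intensity, kernel, mode="same")

# Hypothetical flat trace, used only to show the behaviour away from the edges.
flat = np.ones(100)
smoothed = boxcar_average(flat, window=12)
removed = flat - smoothed    # the component the filter took out of the input
```

With an even window such as 12 the kernel cannot be centered exactly on a sample, so the output is shifted by half a sample relative to the symmetric 2n+1 window of Eq. 1.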


The fast Fourier transform is applied to convert the signal from the time domain to the
frequency domain. The resulting frequency-domain signal is then thresholded to remove
frequencies with amplitudes below a certain threshold. The filtered frequency-domain
signal is then transformed back to the time domain using the inverse FFT (IFFT). The
resulting signal is the denoised intensity data.


The thresholding step is used to remove noise from the signal: setting frequencies with
low amplitudes to zero removes the corresponding noise components. The threshold value,
0.178, is chosen using the universal threshold formula of Eq. (5) and determines the
amount of noise that is removed.
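
The FFT, threshold, and IFFT steps described above can be sketched as follows. The threshold 0.178 is the value quoted in the text, but the amplitude scaling (dividing the FFT magnitude by the signal length) is an assumption made for this example, as is the two-tone test signal.

```python
import numpy as np

def fft_denoise(intensity, threshold):
    """Zero every frequency bin whose amplitude is below the threshold, then invert."""
    spectrum = np.fft.fft(intensity)
    amplitude = np.abs(spectrum) / len(intensity)   # scaling by N is an assumption
    spectrum[amplitude < threshold] = 0
    return np.real(np.fft.ifft(spectrum))

# Hypothetical test signal: a strong low-frequency tone plus a weak high-frequency tone.
n = np.arange(64)
strong = 2.0 * np.sin(2 * np.pi * 4 * n / 64)
weak = 0.1 * np.cos(2 * np.pi * 20 * n / 64)
denoised_fft = fft_denoise(strong + weak, threshold=0.178)
```

Here the strong tone has a scaled amplitude of 1.0 and survives, while the weak tone (0.05) falls below the threshold and is removed. The same global cut applies to every point in the trace, which is the lack of localization discussed in the results.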


The wavelet family is selected based on the signal characteristics. RF/AC interference in
mass spectrometry data is most commonly characterized by high-frequency noise and
periodic artifacts, so it is usually sparse rather than smooth and slowly varying.
Considering these characteristics, and after experimenting with different wavelet
families, I found the Daubechies wavelet db4 with decomposition level 7 to be the most
suitable because of its good time-frequency localization properties.


I used the stationary wavelet transform (SWT) from the PyWavelets library. The SWT is
similar to the discrete wavelet transform (DWT), but it is shift-invariant, meaning that
the result is not affected by the position of the signal. After thresholding, the inverse
SWT is applied to reconstruct the denoised signal.


Thresholding is applied to the detail coefficients, i.e. the high-frequency components of
the wavelet transform. The threshold is the universal threshold proposed by Donoho and
Johnstone, also known as the VisuShrink method, which is often used in wavelet shrinkage
for denoising signals. The universal threshold is designed to remove noise by shrinking
coefficients in the transform domain:


$$\lambda = \sigma \sqrt{2 \log n} \qquad (7)$$


where

$$\sigma = \frac{\mathrm{MAD}}{0.6745}$$


σ is an estimate of the noise level in the data.
n is the number of data points, i.e. the length of the signal.
MAD is the median absolute deviation.
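
The universal threshold of Eq. 7 can be computed in a few lines. In this sketch the MAD is taken about the median of the coefficients and the logarithm is the natural logarithm; the function name, the optional `n` argument, and the sample coefficients are assumptions for illustration.

```python
import numpy as np

def universal_threshold(detail_coeffs, n=None):
    """Universal threshold: sigma * sqrt(2 ln n), with sigma = MAD / 0.6745."""
    d = np.asarray(detail_coeffs, dtype=float)
    mad = np.median(np.abs(d - np.median(d)))   # median absolute deviation
    sigma = mad / 0.6745                        # robust estimate of the noise level
    if n is None:
        n = d.size                              # default: number of coefficients
    return sigma * np.sqrt(2.0 * np.log(n))

# Hypothetical coefficients with median 0 and MAD 2.
coeffs = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
lam = universal_threshold(coeffs)
```

The constant 0.6745 rescales the MAD so that it matches the standard deviation for Gaussian noise, which keeps the estimate from being inflated by the few large coefficients that carry the peaks.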


Next, a choice has to be made between hard and soft thresholding. Hard and soft
thresholding with threshold λ are defined as follows.


The hard thresholding operator is defined as D(U, λ) = U for all |U| > λ and 0 otherwise.
The soft thresholding operator is defined as D(U, λ) = sgn(U) max(0, |U| - λ).
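
Both operators are one-liners in NumPy. The sketch below follows the two definitions above; the function names and the sample coefficients are chosen only for illustration.

```python
import numpy as np

def hard_threshold(u, lam):
    """Keep coefficients with |u| > lam unchanged, set the rest to zero."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) > lam, u, 0.0)

def soft_threshold(u, lam):
    """Zero small coefficients and shrink the survivors toward zero by lam."""
    u = np.asarray(u, dtype=float)
    return np.sign(u) * np.maximum(0.0, np.abs(u) - lam)

# Hypothetical coefficients, thresholded at lam = 1.0.
u = np.array([-3.0, -0.5, 0.2, 1.0, 2.5])
hard = hard_threshold(u, 1.0)   # [-3.0, 0.0, 0.0, 0.0, 2.5]
soft = soft_threshold(u, 1.0)   # [-2.0, 0.0, 0.0, 0.0, 1.5]
```

The hard operator jumps from 0 to λ at the threshold, while the soft operator is continuous there, at the cost of reducing every surviving coefficient by λ.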


Hard thresholding is a “keep or kill” procedure. The alternative, soft thresholding,
shrinks the coefficients above the threshold in absolute value. While hard thresholding
may seem natural, the continuity of soft thresholding has some advantages. Moreover, hard
thresholding does not work with some algorithms, such as the GCV procedure. At times,
pure noise coefficients may pass the hard threshold and appear as annoying ‘blips’ in the
output; soft thresholding shrinks these false structures. For protein mass spectra, where
preserving the sharp peaks is crucial while reducing noise, soft thresholding is the
better choice. It provides a balance by smoothing out noise while maintaining the overall
shape of the peaks.


After implementing the mathematical expressions and thresholding for each denoising
method, I also printed the SNR values calculated according to Eq. 6 from the previous
sections.


Lastly, I plotted separate figures using the Matplotlib library, one for the denoised
signals from the three denoising methods and one for the noise components removed from
the signals, to allow a good comparison among the three.


6.Results & Discussions 


This section describes the execution of the proposed methods. The previous sections
explained the algorithms; this section compares the different denoising methods. I used
Python 3.12.4 to implement the algorithms and the Matplotlib library to plot the output
of each denoising method.


Figure 4 displays the original signal and the denoised signals from the boxcar average,
DWT, and DFT. The noise components from the respective denoising methods are displayed in
Figure 5. The signal-to-noise ratio is computed per Eq. 6 for each of the three methods
and is tabulated in Table 1.


Comparing Table 1 with Figures 4 and 5 reveals that, although it has the highest SNR, the
fast Fourier transform (FFT) failed to denoise the data as well as the boxcar average and
the wavelet transform. The reason is that the Fourier transform cannot detect
discontinuities in a signal: it treats the entire signal as a whole and provides no
information about local variation. Also, unlike the wavelet transform, the Fourier
transform cannot analyse the signal at multiple resolutions, which means it is unable to
capture localized changes in frequency content. While the wavelet transform provides
temporal localization through its wavelets, the Fourier transform can only map
time-domain data to the frequency domain.


All the above points support the conclusion that the noise embedded in the protein mass
spectra data can be substantially removed using the wavelet transform with minimal
distortion of the original data.


Table 01: Signal-to-noise ratios


Denoising Method      SNR (dB)


Boxcar Average        53.43
Wavelet Transform     57.74
Fourier Transform     89.12


Evaluating the denoised data against the raw data provides valuable insight into how
effectively a denoising technique improves data quality: an increased signal-to-noise
ratio, reduced errors, and better data analysis outcomes. Denoising can also help reveal
underlying features in the signal that might be obscured by noise. However, denoising has
its own drawbacks. It can modify the peak shape, height, or width, which might be
important for the analysis, and over-denoising can lead to a loss of information. If not
done properly, denoising can smooth the signal too much, making it difficult to detect
changes and features. On analysing the plots, I can say that denoising has modified the
peak shapes; however, the wavelet transform and Fourier transform methods have preserved
the peak shape and height better than the boxcar average method. The decision to denoise
a signal, especially when doing so modifies the peaks, depends on the context of the
application and the specific characteristics of the signal. In short, it comes down to
balancing the need for noise reduction against the requirement to preserve important
signal features such as peaks.


Experimenting with different techniques and evaluating them carefully is the only way to
find the best approach.
Figure 05: Noise Components from boxcar average, DWT, DFT


7.Conclusions


This work compared the performance of the boxcar average, the wavelet transform, and the
Fourier transform in removing RF/AC interference from the provided data. The details of
the methods were given and simulations were presented. Thresholding, the signal-to-noise
ratio, and wavelet family selection were also discussed, and, based on the results, so
was the question of whether denoising is beneficial or harmful.